Abstract

This study focused on the issue of measurement reliability and its attenuation on 
correlation between two composites, and two seemingly different approaches for correcting the 
attenuation. As expected, correlation coefficients uncorrected for measurement error are 
systematically biased downward. For the data conditions examined, the two correction 
approaches provided not only nearly identical and unbiased means, but also nearly identical
confidence intervals for the sampling distribution of the corrected correlation coefficients. The 
highly comparable results from the two approaches suggest that these two approaches work 
equally well for these data. It is pointed out that the CFA modeling approach may be less 
applicable because of more difficult data conditions at the item level in research practice. The 
findings point to the importance of reporting measurement reliability information whenever 
possible. The findings further suggest that correction for attenuation should be considered when 
information about score reliability is available. 

Keywords: correlation, reliability, attenuation, confirmatory factor analysis 
It is well known that unreliability in measurement attenuates the statistical relationship
between two composites (e.g., Crocker & Algina, 1986; Worthen, White, Fan, & Sudweeks, 
1999). Two approaches have been discussed for correcting such attenuation caused by 
measurement error. The traditional approach is typically discussed within the context of 
measurement reliability and validity, and sample score reliability coefficients of the two 
composites of interest are usually used for algebraically correcting the attenuation of correlation 
caused by measurement unreliability (e.g., Crocker & Algina, 1986; Gulliksen, 1987). The second
approach is often discussed within the context of confirmatory factor analysis, or more broadly, 
structural equation modeling, in which the measurement errors are explicitly modeled, and 
measurement-error-free correlation between two composites (or factors, latent variables) is thus 
obtained (Joreskog & Sorbom, 1989; Loehlin, 1992). It is not clear from the literature how 
comparable the results from the two seemingly different approaches are. The purpose of this 
paper is to present the results of an empirical study in which these two approaches for correcting 
such attenuation of the correlation coefficient are systematically compared with each other, and with
correlation coefficients uncorrected for such attenuation.

Traditional Approach for Correcting the Attenuation 

In classical test theory, the issue of attenuation of correlation between two composites 
caused by measurement unreliability is usually discussed within the context of score reliability and 
validity. More specifically, if there are two measured variables X and Y, their correlation is 
estimated by the Pearson correlation coefficient $r_{xy}$ from a sample. Because the measured
variables X and Y contain random measurement error, this correlation coefficient $r_{xy}$ is typically
lower than the correlation coefficient between the true scores of the variables, $T_x$ and $T_y$ ($r_{T_x T_y}$).
As shown by Crocker and Algina (1986, Chapter 10), failure to take into account such 
attenuation caused by measurement unreliability may potentially lead to erroneous conclusions
about the relationships between the composites, and about measurement validity coefficients.
Although $r_{T_x T_y}$ (the correlation between the true scores of X and Y) cannot be obtained directly from
measurement data, as is shown in the measurement literature (e.g., Crocker & Algina, 1986;
Gulliksen, 1987), the theoretical relationship between $r_{T_x T_y}$, $r_{xy}$, and the reliability coefficients for
composites X and Y ($r_{xx}$, $r_{yy}$) is as follows:

$$r_{xy} = r_{T_x T_y} \sqrt{r_{xx} \, r_{yy}} \qquad (1)$$



So the measurement-error-free relationship between the true scores of X and Y, i.e., the
relationship between X and Y after correcting for the attenuation caused by measurement error,
can be expressed as:

$$r_{T_x T_y} = \frac{r_{xy}}{\sqrt{r_{xx} \, r_{yy}}} \qquad (2)$$
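The algebraic correction, in which the observed correlation is divided by the square root of the product of the two score reliabilities, can be illustrated with a minimal Python sketch. The function name and the input values below are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math

def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Divide an observed correlation by sqrt(r_xx * r_yy) to disattenuate it."""
    if not (0.0 < r_xx <= 1.0 and 0.0 < r_yy <= 1.0):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical inputs: observed r = 0.30, score reliabilities 0.70 and 0.80.
corrected = correct_for_attenuation(0.30, 0.70, 0.80)
print(round(corrected, 3))  # 0.401
```

Because sample reliabilities and sample correlations both carry sampling error, a corrected sample coefficient is not bounded by 1.0 and should be interpreted with that in mind.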



Latent Variable Modeling Approach for Correcting the Attenuation 

In confirmatory factor analysis where each latent factor has multiple indicators, 
measurement error is explicitly modeled in the process. The relationship between two such latent 
factors can be considered as free from the attenuation caused by the measurement error. An 
example for two related latent variables (Fx, Fy), each with four measured indicators (X1-X4,
Y1-Y4), is shown in Figure 1. In this model, $r_{F_x F_y}$ is considered to represent the true relationship
between the two latent variables (Fx, Fy, respectively) that is not attenuated by the measurement
error (modeled as e1-e8 in Figure 1). This approach for obtaining the measurement-error-free
relationship between factors is well known in the area of structural equation modeling, but is
rarely discussed within the context of measurement reliability and validity. 

Insert Figure 1 about here 

The two seemingly different approaches for correcting the measurement error attenuation 
of the relationship between two composites are typically discussed in different areas of research or 
different disciplines. As a result, it is not clear how comparable the results from these two 
approaches will be. Our literature review indicates that these two approaches have not been 
compared in empirical studies, so the merits or demerits of each of these two approaches in terms 
of recovering the true magnitude of the relationship between the two composites (factors) are not 
clear. This study was designed to shed some light on this issue through systematic comparison of 
the two approaches. Monte Carlo simulation was used as the tool for this investigation. 

Methods 

Simulation Design and Data Source 

For the Monte Carlo simulation, several potentially relevant aspects were considered: 
number of items for each composite, magnitude of inter-item correlation within a composite, 
magnitude of inter-factor correlation (i.e., correlation between the two composites), and sample
size. For the number of items per composite, two conditions were considered: each factor had either
four or eight items. For inter-item correlation magnitude, five conditions were used (.81,
.64, .49, .36, .25), so that the inter-item correlations ranged from very high (.81) to relatively
low (.25). The inter-factor correlation had two levels: .4 and .6. Finally, four sample size
conditions were implemented: 50, 100, 200, and 400. The four factors were fully crossed, giving a
total of 80 cells (2 x 5 x 2 x 4). Within each cell, 500 random samples
(replications) were drawn from the specified statistical population, giving a total of
40,000 replications for the Monte Carlo simulation experiment [(2 x 5 x 2 x 4) x 500].
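The fully crossed design can be enumerated with a short Python sketch. The variable names are illustrative assumptions; only the condition values and the number of replications per cell come from the design described above.

```python
from itertools import product

# Design factors as described in the text.
items_per_composite = (4, 8)
inter_item_r = (0.81, 0.64, 0.49, 0.36, 0.25)
inter_factor_r = (0.4, 0.6)
sample_sizes = (50, 100, 200, 400)
replications_per_cell = 500

# Fully crossing the four factors yields one tuple per design cell.
cells = list(product(items_per_composite, inter_item_r,
                     inter_factor_r, sample_sizes))
total_replications = len(cells) * replications_per_cell

print(len(cells), total_replications)  # 80 40000
```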
Once the inter-item correlation was specified, the population reliability in the form of 
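Cronbach's coefficient alpha could be obtained. Cronbach's coefficient alpha takes the form:

$$\alpha = \frac{k}{k-1} \left( 1 - \frac{\sum \sigma_i^2}{\sigma_X^2} \right) \qquad (3)$$

where k is the number of items within a composite, $\sum \sigma_i^2$ is the sum of item variances, and $\sigma_X^2$ is
the variance of the composite score. The variance of the composite, $\sigma_X^2$, is simply the sum of the item
variances ($\sum \sigma_i^2$) and the sum of the item covariances ($2 \sum_{i<j} \sigma_{ij}$):

$$\sigma_X^2 = \sum \sigma_i^2 + 2 \sum_{i<j} \sigma_{ij} \qquad (4)$$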

In this study, at the item level, normal standardized variables (normally distributed with $\mu$
= 0 and $\sigma^2$ = 1) were simulated. The covariance between two standardized variables is simply the
correlation between them. So for a composite consisting of k standardized variables with an equal
inter-item correlation coefficient of $\rho$, we have the following:

$$\sum \sigma_i^2 = k, \quad \text{and} \quad 2 \sum_{i<j} \sigma_{ij} = k(k-1)\rho$$

So, population Cronbach's coefficient alphas for the composites simulated in this study are:

$$\alpha = \frac{k}{k-1} \left[ 1 - \frac{k}{k + k(k-1)\rho} \right] \qquad (5)$$
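As a check on the population alpha formula for k standardized items with a common inter-item correlation, the following Python sketch evaluates it for the ten item-level conditions of the design. The function name is a hypothetical label used here for illustration.

```python
def population_alpha(k, rho):
    """Population coefficient alpha for k standardized items with equal inter-item correlation rho."""
    sum_item_variances = k                    # each standardized item has variance 1
    composite_variance = k + k * (k - 1) * rho
    return (k / (k - 1)) * (1 - sum_item_variances / composite_variance)

for k in (4, 8):
    for rho in (0.25, 0.36, 0.49, 0.64, 0.81):
        print(k, rho, round(population_alpha(k, rho), 4))
```

For example, k = 4 and an inter-item correlation of 0.25 give 4/7, or 0.5714, and k = 8 with 0.81 gives 0.9715, which are the lowest and highest reliabilities listed in Table 1.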



Table 1 presents the data conditions and the associated population Cronbach's coefficient 
alphas for the composites simulated. Because inter-factor correlation (0.4 and 0.6, respectively, in
this study) does not affect the reliability of the composites, the population Cronbach's coefficient
alphas were the same for inter-factor correlations of 0.4 and 0.6. The population Cronbach's
coefficient alphas presented in Table 1 show that the score reliability conditions examined in this
study ranged from marginal reliability ($\alpha$ = 0.57) to very high reliability ($\alpha$ = 0.97).

Insert Table 1 about here 

The simulation was carried out using the SAS system; a combination of
SAS/MACRO, SAS/BASE, SAS/PROC IML (Interactive Matrix Language), and SAS/STAT was
used for accomplishing the tasks. Confirmatory factor analysis (CFA) was implemented by using
SAS/PROC CALIS (Covariance Analysis of Linear Structures). Random normal variables were
generated by using the random normal number generator (RANNOR) in SAS. The population
inter-variable correlations were obtained from the two-factor model in Figure 1 based on the
following (Joreskog & Sorbom, 1989):



$$\Sigma = \Lambda \Phi \Lambda' + \Theta \qquad (6)$$

where $\Sigma$ is the population covariance matrix (correlation matrix for our standardized variables),
$\Lambda$ is the matrix of population pattern coefficients in Figure 1, $\Phi$ is the population correlation
matrix for the two factors, and $\Theta$ is the covariance matrix of population residuals for the items.
For simulating the specified population inter-variable correlations in $\Sigma$ in Equation 6, the matrix
decomposition procedures (see Kaiser & Dickman, 1962) were implemented.
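The data-generation step can be sketched in Python as follows. This is an illustration under stated assumptions, not the SAS code used in the study: it assumes that every item loads on its factor with the same pattern coefficient, equal to the square root of the inter-item correlation, and it uses a Cholesky factorization as a stand-in for the decomposition procedure of Kaiser and Dickman (1962). All function names are hypothetical.

```python
import math
import random

def population_correlation(k, rho_item, phi):
    """Build Sigma = Lambda Phi Lambda' + Theta for two factors with k items each."""
    lam = math.sqrt(rho_item)  # assumption: equal loadings, so lam**2 = inter-item r
    p = 2 * k
    # Pattern matrix: the first k items load on factor 1, the rest on factor 2.
    Lambda = [[lam, 0.0] if i < k else [0.0, lam] for i in range(p)]
    Phi = [[1.0, phi], [phi, 1.0]]
    sigma = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            common = sum(Lambda[i][a] * Phi[a][b] * Lambda[j][b]
                         for a in range(2) for b in range(2))
            unique = (1.0 - rho_item) if i == j else 0.0  # Theta is diagonal
            sigma[i][j] = common + unique
    return sigma

def cholesky(a):
    """Return lower-triangular L with L L' = a (a must be positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def simulate_items(n, sigma, rng):
    """Draw n rows of item scores whose population correlation matrix is sigma."""
    L = cholesky(sigma)
    p = len(sigma)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]  # independent standard normals
        rows.append([sum(L[i][m] * z[m] for m in range(i + 1)) for i in range(p)])
    return rows

rng = random.Random(2024)  # arbitrary seed, for reproducibility only
sigma = population_correlation(k=4, rho_item=0.49, phi=0.4)
items = simulate_items(200, sigma, rng)
```

Under these assumptions, items within a composite correlate 0.49 in the population, items from different composites correlate 0.49 x 0.4 = 0.196, and every item has unit variance.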
Results and Discussions

Preliminary Results 

In fitting a confirmatory factor analysis model to sample data, we may sometimes
encounter the problem of non-convergence, i.e., we may fail to obtain model parameter estimates
for a sample because the estimation does not converge. This possibility was checked. The results
show that, for the simple two-factor CFA model shown in Figure 1, all samples converged, with small
samples requiring more iterations for achieving convergence than larger samples, consistent with what was
shown in the literature (e.g., Fan & Wang, 1998). Table 2 presents the average number of
iterations required for achieving convergence for different sample size conditions. 

Insert Table 2 about here 



Results for 4-Item Composites 

Figure 2 graphically presents the results for correlation coefficients between two 
composites, each consisting of four items. In Figure 2, the population correlation between the 
two factors is 0.40. The 95% confidence intervals (upper and lower limits) for three types of 
correlation coefficients between two composites are displayed, with the line in the middle of the 
bar indicating the mean of the correlations from 500 samples. These are empirical confidence 
intervals based on exact percentile points, not on standard errors. As a result, the construction of 
these confidence intervals does not assume normal distribution. The three types of correlations 
displayed are: a) the correlation coefficient of two composites without correcting for the 
attenuation of measurement unreliability (R_xy), b) correlation coefficient between two 
composites corrected for the attenuation of measurement unreliability (R_xy Corrected; algebraic 
correction based on Equation 2), and c) the measurement-error-free correlation between two 
latent factors as in Figure 1 (R_CFA). The dashed horizontal line at 0.4 represents the population
correlation between the two latent variables. 
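An empirical percentile interval of this kind can be sketched in Python. The interpolation rule used for the percentile points is an assumption, since the exact rule applied in the study is not stated, and the list of estimates is a hypothetical stand-in for the 500 replicated correlations in a cell.

```python
def percentile(sorted_values, q):
    """q-th percentile (0-100) by linear interpolation between order statistics."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

def empirical_interval(estimates, level=95.0):
    """Return (lower limit, mean, upper limit) from the empirical distribution."""
    ordered = sorted(estimates)
    tail = (100.0 - level) / 2.0
    mean = sum(ordered) / len(ordered)
    return percentile(ordered, tail), mean, percentile(ordered, 100.0 - tail)

# Hypothetical stand-in for 500 replicated estimates: 0.000, 0.001, ..., 0.499.
estimates = [i / 1000.0 for i in range(500)]
lower, mean, upper = empirical_interval(estimates)
```

No standard error or distributional form enters the computation, which is why such an interval does not rely on a normality assumption.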

Insert Figure 2 about here 

Several phenomena stand out in Figure 2. First, the correlation coefficient between two 
composites uncorrected for measurement error has obvious systematic downward bias, as 
indicated by the fact that the mean of the uncorrected correlation coefficients is systematically 
lower than the population correlation of 0.4. The more measurement error the composites 
contain (i.e., the lower the reliability Rxx), the more downward bias the correlation has. When 
population correlation is 0.40, the average uncorrected correlation coefficients are 0.23 (for
$\alpha$ = 0.57), 0.27 (for $\alpha$ = 0.69), 0.31 (for $\alpha$ = 0.79), 0.34 (for $\alpha$ = 0.88), and 0.38 (for $\alpha$ = 0.94),
respectively. By itself, this is not surprising, because this fact of downward bias is well known.
What is surprising is the observation that the downward bias can be such that even the upper limit 
of the 95% confidence interval for the uncorrected correlation may still be lower than the 
population correlation, especially when the sample size is relatively large (e.g., N = 200, 400) and 
measurement reliability is relatively low (e.g., Rxx = .57, .69). 
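The size of this bias can be anticipated from the classical attenuation relationship, in which the expected observed correlation equals the true-score correlation multiplied by the square root of the product of the two reliabilities. The Python sketch below applies that relationship to the 4-item population alphas and sets the results beside the average uncorrected correlations reported above; the function name is a hypothetical label.

```python
import math

def expected_observed_r(rho_true, r_xx, r_yy):
    """Attenuated population correlation implied by the two reliabilities."""
    return rho_true * math.sqrt(r_xx * r_yy)

rho_true = 0.40
alphas = (0.5714, 0.6923, 0.7935, 0.8767, 0.9446)  # 4-item population alphas
reported_means = (0.23, 0.27, 0.31, 0.34, 0.38)    # averages reported in the text

# Both composites share the same reliability in each condition.
expected = [expected_observed_r(rho_true, a, a) for a in alphas]
for a, e, m in zip(alphas, expected, reported_means):
    print(f"alpha={a:.4f}  expected={e:.4f}  reported mean={m:.2f}")
```

The expected values (about 0.229, 0.277, 0.317, 0.351, and 0.378) agree with the reported sample means to within about 0.01.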

Second, for the uncorrected correlation coefficient (R_xy), the confidence interval width 
(i.e., sampling variation) does not change with measurement reliability. The confidence interval 
widths of the other two correlation coefficients corrected for measurement error (R_xy 
Corrected, R_CFA), however, are related to measurement reliability: the lower the measurement 
reliability, the more sampling variation for these two types of correlation coefficients.

Third, the two approaches for correcting the attenuation of the correlation coefficient caused by
measurement error (algebraic correction based on Equation 2, and that based on confirmatory
factor analysis) worked remarkably well in capturing the population correlation: the
means of the corrected correlation coefficients based on these two approaches are right on the
target of the population correlation of 0.40 for all five conditions of measurement reliability
considered in this study. This shows that these two correction approaches produced unbiased
sample estimates.

Fourth, when sample size was small (e.g., N = 50), some slight differences occurred in the
confidence interval widths based on the two correction approaches. But the two
correction approaches generally produced highly consistent results, both in terms of their almost
identical means, and in terms of their almost identical confidence interval widths (upper and lower
limits).

Figure 3 displays the confidence intervals of the three types of correlation coefficients 
between two composites when the population correlation between the two is 0.60. Here, the 
results essentially replicate those in Figure 2. The downward bias of the uncorrected correlation 
coefficients between two composites, however, appears to be more severe. When the population
correlation is 0.60, the means of the sample correlation coefficients uncorrected for the
attenuation caused by measurement error are 0.34 (for $\alpha$ = 0.57), 0.41 (for $\alpha$ = 0.69), 0.47 (for
$\alpha$ = 0.79), 0.52 (for $\alpha$ = 0.88), and 0.56 (for $\alpha$ = 0.94), respectively. When measurement reliability is
relatively low (e.g., $\alpha$ = 0.57, 0.69), even the upper confidence limit of the sample
correlations may be quite a bit lower than the population correlation of 0.60, let alone the mean of
the uncorrected sample correlations. In some situations, the upper confidence limit of the uncorrected
sample correlations falls below even the lower confidence limit of the
corrected sample correlations (e.g., N = 400, $\alpha$ = 0.57, 0.69). This suggests that the attenuation
of the sample correlation coefficient caused by measurement error may be more severe than many
researchers realize. In many research situations, it is not uncommon to have measurement
reliability in the range of 0.60-0.80. Under such conditions, even the upper confidence
limit itself may fail to capture the true correlation between two composites.

Insert Figure 3 about here 



Figures 4 and 5 present the findings for correlations between two 8-item composites, for 
inter-factor correlation of 0.4 (Figure 4) and 0.6 (Figure 5) respectively. The general 
observations in these two figures are closely comparable to those discussed above for the situation
of 4-item composites, and thus are not repeated here.

Insert Figures 4 and 5 about here 



The findings in this study suggest several things related to our research practice. First, it 
is important to report measurement reliability in a research study, even if the study does not focus 
on measurement issues. As discussed by Wilkinson and The APA Task Force on Statistical 
Inference (1999), “ . . . authors should provide reliability coefficients of the scores for the data 
being analyzed even when the focus of their research is not psychometric. Interpreting the size of 
the observed effects requires an assessment of the reliability of the scores” (p. 596). It is obvious
from the figures presented above that reporting measurement reliability helps readers to evaluate 
the magnitudes of correlation coefficients reported in a study. 

Although reporting measurement reliability appears to be a simple task that makes 
common sense in research, this is still far from being actual common research practice. Two 
decades ago, in a review of the articles in the American Educational Research Journal (AERJ), 
Willson (1980) reported that less than 50% of the published studies did not report measurement
reliability, and only about 37% of the studies reported score reliability coefficients of their own
data. Willson commented that ". . . reliability is unreported in almost half the published research
is . . . inexcusable at this late date" (pp. 8-9). More recently, Yin and Fan (2000) reported that 
only about a dismal 8% of the published studies involving the use of the Beck Depression 
Inventory actually reported reliability coefficients for their own data. The fact that only such a 
small percentage of studies reported their measurement reliabilities "... shows that the concept of 
test score reliability has not generally prevailed . . . and research practice . . . still leaves much to 
be desired" (Yin & Fan, 2000, p. 210). 

Reporting measurement reliability for data used in analyses is more than a psychometric 
concern, however. As clearly shown in Figures 2 to 5 in this study, measurement reliability is 
directly related to our interpretation of statistical analysis results. As Thompson (1994) 
discussed, "the failure to consider score reliability in substantive research may exact a toll on the 
interpretations within research studies. For example, we may conduct studies that could not 
possibly yield noteworthy effect sizes given that score reliability inherently attenuates effect sizes. 
Or we may not accurately interpret the effect sizes in our studies if we do not consider the 
reliability of the scores we are actually analyzing" (p. 840).

Second, since both correction approaches work so well in providing unbiased estimates 
for correlation between composite measures, it appears reasonable that correction for correlation 
attenuation caused by measurement error should be done whenever measurement reliability 
coefficients are available. In applied research practice in education and psychology, it is not 
uncommon to have somewhat low or moderate measurement reliability (e.g., in the range of 0.60- 
0.80). As shown in several meta-analytic reliability generalization studies (e.g., Capraro, Capraro, 
& Henson, 2001; Caruso, 2000; Yin & Fan, 2000), measurement reliabilities in psychological
studies were often as low as or lower than 0.60-0.80. As shown in Figures 2 to 5, when
measurement reliability is in this range or lower, the downward bias of the uncorrected correlation
between two composite measures could be substantial. It is possible that such attenuation of
correlation caused by measurement error may have camouflaged many meaningful relationships
from researchers.

It should be pointed out that, although the two approaches for correcting attenuation 
caused by measurement unreliability produced highly comparable results, this does not mean that 
both approaches are readily applicable in all situations in which there are composites with multiple 
items. In general, it is typically difficult to model item-level data in practice. Gorsuch (1997, pp.
315-316) discussed several reasons for the difficulty of modeling item-level data in practice: a)
items often have low reliability, b) items often contain confounding variance in addition to the
construct being measured, c) item distributions often differ from each other, and d) item scores are
typically a set of ordered categories rather than continuous. For this reason, the approach of
algebraic correction for attenuation based on Equation 2 is more readily usable in research 
practice. In this sense, these two approaches may not be comparable in actual application, despite 
their comparable results for the ideal data conditions considered in this study. 

Summary and Conclusions 

This study focused on the issue of measurement reliability and its attenuation on 
correlation between two composites. The simulation design allowed systematic comparison of the 
traditional approach for algebraically correcting for attenuation of relationship caused by 
measurement error, and the modeling approach based on confirmatory factor analysis in which 
measurement error is specifically modeled. The study provides useful information for better 
understanding of the issue of attenuation caused by measurement error, and about two different
approaches for correcting such attenuation. Four factors were considered in this study:
composite reliability, number of items comprising the composites, inter-factor correlation, and 
sample size. Within each cell condition, 500 replications were conducted to estimate the sampling 
distributions of the uncorrected and corrected correlation coefficients. The findings show that, as 
expected, correlation coefficients uncorrected for measurement error are systematically biased 
downward. The magnitude of such downward bias is related to measurement reliability of the 
composite in reverse direction: the lower the reliability, the larger the magnitude of the downward 
bias. When measurement reliability is low or moderate (e.g., 0.60-0.80), not only may the average of
such sample correlations be substantially lower than the population parameter, but even the
upper confidence limit of the uncorrected sample correlations may fail to capture the
population correlation.

For the data conditions considered, the two correction approaches provided not only nearly
identical and unbiased means, but also nearly identical confidence intervals for the sampling
distribution of the corrected correlation coefficients. The highly comparable results from the two
correction approaches suggest that these two approaches work equally well for these data. It is
pointed out, however, that the CFA modeling approach may be less applicable in research practice
because of more difficult data conditions at the item level. The findings in this
study point to the importance of reporting measurement reliability information in research practice 
whenever possible. The findings further suggest that correction for attenuation should be 
considered when information about score reliability is available. 
Table 1 Data Conditions and Reliability for the Composites
(Inter-Factor Correlation is 0.4 and 0.6)

Number of Items     Inter-Item      Composite
in Composite        Correlation     Reliability ($\alpha$)

4                   0.25            0.5714
                    0.36            0.6923
                    0.49            0.7935
                    0.64            0.8767
                    0.81            0.9446

8                   0.25            0.7273
                    0.36            0.8182
                    0.49            0.8849
                    0.64            0.9343
                    0.81            0.9715

Table 2 Number of Iterations for Achieving Convergence in CFA Modeling

N          Average     Minimum     Maximum

4-Item Composites
50         12.78       4           50
100         8.61       4           50
200         6.53       3           33
400         5.22       3           15

8-Item Composites
50          9.71       5           50
100         7.09       5           50
200         5.77       4           11
400         4.82       4            8

Figure Captions

Figure 1 A Correlated Two Factor Model with Four Indicators for Each Factor

Figure 2 95% CI for Correlations between 4-Item Composites: Uncorrected (R_xy), Corrected
for Unreliability (R_xy Corrected), and that from CFA Model (R_CFA) - Inter-Factor
$\rho$ = 0.40

Figure 3 95% CI for Correlations between 4-Item Composites: Uncorrected (R_xy), Corrected
for Unreliability (R_xy Corrected), and that from CFA Model (R_CFA) - Inter-Factor
$\rho$ = 0.60

Figure 4 95% CI for Correlations between 8-Item Composites: Uncorrected (R_xy), Corrected
for Unreliability (R_xy Corrected), and that from CFA Model (R_CFA) - Inter-Factor
$\rho$ = 0.40

Figure 5 95% CI for Correlations between 8-Item Composites: Uncorrected (R_xy), Corrected
for Unreliability (R_xy Corrected), and that from CFA Model (R_CFA) - Inter-Factor
$\rho$ = 0.60